Abstract

Background: Biobanks are a critical resource for translational science. Recently, semantic web technologies such as 
ontologies have been found useful in retrieving research data from biobanks. However, recent research has also 
shown that there is a lack of data about the administrative aspects of biobanks. These data would be helpful to 
answer research-relevant questions such as what is the scope of specimens collected in a biobank, what is the 
curation status of the specimens, and what is the contact information for curators of biobanks. Our use cases 
include giving researchers the ability to retrieve key administrative data (e.g. contact information, contact's 
affiliation, etc.) about the biobanks where specific specimens of interest are stored. Thus, our goal is to provide an 
ontology that represents the administrative entities in biobanking and their relations. We base our ontology 
development on a set of 52 data attributes called MIABIS, which were in part the result of semantic integration 
efforts of the European Biobanking and Biomolecular Resources Research Infrastructure (BBMRI). The previous work 
on MIABIS provided the domain analysis for our ontology. We report on a test of our ontology against competency 
questions that we derived from the initial BBMRI use cases. Future work includes additional ontology development 
to answer additional competency questions from these use cases. 

Results: We created an open-source ontology of biobank administration called Ontologized MIABIS (OMIABIS) coded 
in OWL 2.0 and developed according to the principles of the OBO Foundry. It re-uses pre-existing ontologies when 
possible in cooperation with developers of other ontologies in related domains, such as the Ontology of 
Biomedical Investigation. OMIABIS provides a formalized representation of biobanks and their administration. Using 
the ontology and a set of Description Logic queries derived from the competency questions that we identified, we 
were able to retrieve test data with perfect accuracy. In addition, we began development of a mapping from the 
ontology to pre-existing biobank data structures commonly used in the U.S. 

Conclusions: In conclusion, we created OMIABIS, an ontology of biobank administration. We found that basing its 
development on pre-existing resources to meet the BBMRI use cases resulted in a biobanking ontology that is re-useable 
in environments other than BBMRI. Our ontology retrieved all true positives and no false positives when queried 
according to the competency questions we derived from the BBMRI use cases. Mapping OMIABIS to a data structure 
used for biospecimen collections in a medical center in Little Rock, AR showed adequate coverage of our ontology. 
Introduction

Biobanks are a critical resource in translational science, 
such as translational oncology, as they provide speci- 
mens essential to the identification of novel biomarkers 
for specific therapies [1]. Recent research has provided 
compelling examples of using semantic web technolo- 
gies, such as ontologies, to retrieve research-relevant 
data from biobanks [2,3]. However, [4] point out that 
little attention is paid to collecting data about the dif- 
ferent ways in which biobanks are organized. This lack 
is apparent in both of the ontologies considered by the 
authors of [2,3]: Neither the Ontology of Biomedical 
Investigation (OBI)^a nor the Translational Medicine 
Ontology (TMO)^b represents biobanks, biobank organizations, 
or related entities. This situation makes it 
impossible to query biobanks with respect to orga- 
nizational structures, ownership of biobanks and speci- 
mens, and the curation status of specimens. Thus, our 
goal was to provide an ontology that represents the 
administrative aspects of the biobanking domain to 
enable querying biobank data from both the specimen 
or population perspective and the administrative pers- 
pective. Our ontology is called Ontologized MIABIS 
(OMIABIS) and is named after the Minimum Informa- 
tion About Biobank data Sharing (MIABIS) [5]. The 
latter provided the starting point for our ontology de- 
velopment. We recently released the initial version of 
OMIABIS coded in Web Ontology Language 2.0. It can 
be downloaded from http://purl.obolibrary.org/obo/ 
omiabis.owl. The ontology is open source and we invite 
the community to develop it further with us. 

In the background section we introduce MIABIS and 
its use cases. In the methods section we describe our 
approach to ontology development including the re-use 
of existing ontologies. In addition, we introduce our 
approach to testing the ability of the ontology to an- 
swer competency questions derived from our use cases. 
In the results section, we show the basic features of our 
ontology and present the results of our evaluation of its 
adequacy. Finally, we discuss future work and potential 
uses of the ontology, as well as its connections to 
ongoing efforts in biomedical ontology. 

Background 

Introducing BBMRI 

For an initial domain analysis we relied on the work on 
data integration done by the European Biobanking and 
Biomolecular Resources Research Infrastructure (BBMRI). 
During the so-called preparatory phase of BBMRI, from 
2008 to 2011, the initiative comprised 54 different 
partners across Europe and more than 225 associated 
organizations representing over 30 countries. One of the 
aims of the BBMRI is to provide the necessary formats to 
compare biobank information at different levels of detail 
[6]. Work on data integration within BBMRI used at 
least two approaches: a survey of the samples and data 
of European biobanks using questionnaires, resulting 
in the Catalogue of European Biobanks [7], and the 
development of a common information model for a 
hub-and-spokes structure for national or regional biobank 
nodes [8]. Because biobank data is often related to personal 
health data, its management and sharing must comply 
with the applicable law, which in the European context 
is Directive 95/46/EC. In combination with several 
other integration issues identified in [9], this means that 
the establishment of an information model for sharing 
biobank data on an international level will require future effort. In 
the meantime, and to meet the demand of the biobank 
community to understand what data should be stored 
in relation to biological samples, a minimum list of 
data attributes was drafted as one of the last activities 
in the preparatory phase of BBMRI. One of the activities 
in the Swedish BBMRI, i.e., BBMRI.se, has been to 
continue the development of the minimum information 
list from the European BBMRI. The updated version is 
called MIABIS - Minimum Information About Biobank 
data Sharing - and consists of fifty-two attributes 
considered important for establishing a system of data 
discovery for biobanks and sample collections. To 
avoid legal issues related to individual subjects, cases 
or samples are not considered at present [5]. The attributes 
employ existing standards, e.g., the Sample 
PREanalytical Code (SPREC) [10], ICD codes^c, and 
definitions developed by the Public Population Project 
in Genomics (P3G)^d and the International Society for 
Biological and Environmental Repositories (ISBER)^e. 

Use cases for MIABIS & OMIABIS 

MIABIS was developed in the context of several use 
cases described by invited researchers as part of the 
BBMRI project. Our two example use cases stem from 
the development of MIABIS: 

a) Search for tissue samples from donors diagnosed 
with nemaline myopathy. Determine the age group. 
What are the sample storage conditions? Contact 
the biobank for detailed information about the 
biopsy samples and whether myoblast cell cultures 
have been grown from these samples. 

b) Search for sample collections having at least 10 
cases with tissue from the thoracic aorta as well 
as blood, serum, or plasma from the same 
donor. Also check if clinical data has been 
registered for the donors such as physical 
measurements. Contact the person responsible 
for the sample collection to obtain detailed 
information on the specific kind of thoracic 
aorta biopsies of interest. Also ensure that the 
biopsies were performed within one week before or 
after the blood sampling. 

Use case b) would require inclusion of individual- 
level data. As mentioned above, the attributes for 
representing data about individual donors and speci- 
mens were dropped during MIABIS development due 
to regulatory issues. 

Already, MIABIS is being used in a structured Scandi- 
navian survey to gather information about sample col- 
lections stored in biobanks in a searchable database 
(www.bbmriregister.se). Increasing the total searchable 
information could include uploading new data directly 
to the existing system, and/or developing external data- 
bases that structure the information according to MIABIS. 
In the latter case, an ontologized version of MIABIS will 
be used to perform a federated search across the multiple 
databases. This search capability will minimize the effort a 
researcher must expend to search for sample collections of 
interest, by avoiding the need to query several separate da- 
tabases one by one. Hence, the University of Arkansas for 
Medical Sciences and Karolinska Institutet, representing 
BBMRI.se, decided to initiate a biobank ontology develop- 
ment project as a joint effort. 

Methods 

Our aim is to provide a semantically rich representa- 
tion of biobank administration to facilitate the sharing 
of biobank data. We based our development on an ana- 
lysis of the minimum requirements for sharing biobank 
data done within the BBMRI as captured by MIABIS. 
Hence, we named our ontology OMIABIS, standing for 
Ontologized MIABIS. To make the ontology easily accessible 
and implementable, we chose Web Ontology 
Language (OWL) 2 [11] for implementation. To facilitate 
re-use and harmonization across ontologies, we 
used Basic Formal Ontology (BFO)^f as the upper ontology 
[12,13]. In addition, the entire ontology development 
followed the principles of ontology development 
as set forth by the OBO Foundry [14]^g. 

Re-use of preexisting ontologies is key among the 
OBO Foundry principles. In creating OMIABIS we 
imported the Proper Name Ontology (PNO)^h in its entirety. 
PNO is based on the Information Artifact Ontology 
(IAO)^i. It is a formal representation of proper names 
based on Devitt's theory of designation [15]. Thus, 
OMIABIS is an extension of IAO. In addition, multiple 
entities from other ontologies, namely the Ontology of 
Biomedical Investigations (OBI)^j and the Ontology of 
Medically Related Social Entities (OMRSE)^k, are imported 
using a tool based on the MIREOT methodology 
[16], which was developed in a joint endeavor between 
the University of Arkansas for Medical Sciences and the 
University of Arkansas at Little Rock [17]. 



We chose to re-use the ontologies mentioned above 
based on the fact that they are members of the OBO 
Foundry and, thus, are built according to the same basic 
principles and extend the same upper ontology (BFO). 
Our aim is to create ontological representations that 
facilitate the integration of biobank administrative data 
with biomedical research data. The latter often is anno- 
tated with terms from Gene Ontology (GO) or OBI. 
Thus, choosing ontologies from the very same orthog- 
onal ontology library (OBO Foundry) of which the latter 
are members appears to be the best strategy to accom- 
plish this integration. 

All directly imported ontologies (BFO, PNO, IAO) will 
update automatically. MIREOT, so far, does not have a 
strategy for automated updates. However, the developers 
of the MIREOT plugin plan to include this functionality 
in a future release. 

In addition to these ontologies, the development of 
OMIABIS was informed by other pre-existing ontologies 
in the biobanking domain mentioned in the Discussion 
section of this paper. 

Because existing ontologies already represent speci- 
mens, clinical studies and populations, OMIABIS repre- 
sents the domain of biobank administration. Together 
with terms from these specimen-focused ontologies, 
OMIABIS needs to allow the level of semantic integra- 
tion required by the use cases described above. 

OMIABIS was developed using Protege 4.1.0, Build 239^l. 
The MIREOT Plugin is Version 1.0.1. The consistency of 
our ontology was verified using the HermiT 1.3.6 reasoner^m. 

To test the adequacy of our ontology for the BBMRI 
use cases (see the Background section) we derived a set of 
competency questions from them. Because the focus of the 
ontology is the administrative aspects of biobanks, the use 
cases entail some competency questions that fall outside 
the scope of our ontology at this point (namely all ques- 
tions related to the different donor subpopulations). 

The competency questions we address and evaluate 
here are: 

• Which biobanks hold frozen specimens? 

• Which biobanks hold blood, plasma and serum? 

• Which blood plasma specimens are owned by one 
specific biobank organization? 

• Which departments of a specific university have 
members that are serving as biobank contacts? 

• What are the e-mail addresses of all biobank contact 
persons at one specific biobank organization? 

These competency questions were approved by the 
domain experts from Karolinska Institutet. 

To perform DL queries that test the adequacy of the 
ontology to retrieve data that answer the competency 
questions, we populated an OWL file (that imports 
OMIABIS) with instances or individuals from a made-up 
biobank example. In OWL it is possible to represent 
the individual members of classes. OMIABIS per se 
does not represent any individuals, but it imports 326 
individuals from GEO that represent nations and their 
administrative subdivisions (to enable capture of the 
mailing addresses of biobank contacts). In the instance-level 
OWL file we included both individuals that should be 
retrieved (true positives) and individuals that should not 
(potential false positives), to ensure that queries did not 
retrieve incorrect information. This file is called 
CompetencyTest.owl, and can be downloaded from: 
http://omiabis-dev.googlecode.com/svn/branches/CompetencyTest.owl. 
In addition, we submitted the file to this journal as 
Additional file 1. 

The actual queries we ran together with the results 
can be found in Table 1. 

Results 

Implementation of OMIABIS 

The latest release of OMIABIS in OWL can be downloaded 
from the permanent URL 
http://purl.obolibrary.org/obo/omiabis.owl. In our research we focused on 
representing the MIABIS data attributes that concern 
biobanks and studies/sample collections; this comprises 
all classes and object properties closely related to 
administrative aspects. 

The central class of any biobank ontology ought to be 
the class of biobanks or biorepositories. MIABIS differ- 
entiates biobanks from the organizations that own them. 
Accordingly, OMIABIS defines "biobank" as follows: "A 
biobank is a collection of samples of biological sub- 
stances (e.g. tissue, blood, DNA) which are linked to 
data about the samples and their donors. They have a 
dual nature as collections of samples and data." The def- 
inition is derived from the definition for human biobank 
in [18]. The latter does not define "biobank" in general, 
but we generalized their definition to be applicable to 
any kind of biobank. The class is formally restricted to 
be the equivalent of: 

"object_aggregate 
  AND has_part SOME 
    (object_aggregate 
      AND (has_part ONLY 
        (specimen AND participates_in SOME storage))) 
  AND has_part SOME (material information bearer 
    AND (participates_in SOME 
      (digital curation AND 
        (has_specified_output SOME 
          (data set AND (is about SOME 
            (object_aggregate 
              AND (has_part ONLY specimen))))))))" 

Notably, the biobank as such is neither an organization 
nor a facility, but the aggregate of the specimens and the 
data regarding these specimens. OMIABIS also represents 
"biobank organization". Its textual definition is: "A biobank 
organization is an organization bearing legal personality 
that owns or administrates a biobank". "Biobank organization" 
is equivalent to: 

"organization 
  AND ((owns SOME biobank) 
    OR (administrates SOME biobank)) 
  AND (bearer_of SOME legal person role)" 

Referring to the class "legal person role" from 
OMRSE is necessary due to the fact that the definition 
of organization in OBI does not refer to legal 
personality^p. Any group of human beings that has 
some organizational rules fulfills the textual definition 
according to OBI. However, for our use case legal personality 
is crucial, since within the BBMRI framework 
we are concerned with management of certain rights 
and obligations, which are held by legal persons. The 
formal description of biobank organization uses two 
object properties which have been specifically created 
for OMIABIS: 

1. "owns" 

Elucidation: This is a primitive relation. This relation 
is the foundation of the owner's right to have the owned 
entity at his/her full disposal. 

Domain: Homo sapiens 

OR organization 

OR collection of humans 

OR aggregate of organizations 

Range: information content entity 
OR material_entity 

Characteristics: asymmetric 

The elucidation for this primitive relation is based on 
Reinach's legal ontology [19]. For further material on the 
ontology of claims and obligations see [20]. 

2. "administrates" 

Definition: "a administrates b if c owns b and some 
rights and obligations grounded in the owning relation 
regarding b are transferred^q from c to a." 

Domain: Homo sapiens 

OR organization 

OR collection of humans 

OR aggregate of organizations 

Range: information content entity 
OR material_entity 

Characteristics: asymmetric 
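
The equivalence axiom for "biobank organization" and the definition of "administrates" can be read as a membership test over instance data. The following is a minimal Python sketch of that reading, not the OWL semantics themselves: it assumes a closed world over a handful of hypothetical triples (all individuals are invented for illustration; only the name 'Unseen University' is borrowed from the test queries), whereas an OWL reasoner works under the open-world assumption. The transfer of rights and obligations in the definition of "administrates" is not modeled.

```python
# Toy, closed-world reading of the "biobank organization" axiom.
# All individuals and triples below are hypothetical illustration data.

TYPES = {
    "Unseen University": {"organization"},
    "UU Biobank": {"biobank"},
    "Curation Service Ltd": {"organization"},
    "Reading Group": {"organization"},
    "legal role 1": {"legal person role"},
    "legal role 2": {"legal person role"},
}

TRIPLES = {
    ("Unseen University", "owns", "UU Biobank"),
    ("Unseen University", "bearer_of", "legal role 1"),
    ("Curation Service Ltd", "administrates", "UU Biobank"),
    ("Curation Service Ltd", "bearer_of", "legal role 2"),
    # "Reading Group" is an organization without legal personality.
}


def has_type(individual, cls):
    """True if the individual is asserted to be a member of cls."""
    return cls in TYPES.get(individual, set())


def some(subject, prop, cls):
    """Existential restriction: subject prop SOME cls."""
    return any(
        s == subject and p == prop and has_type(o, cls)
        for (s, p, o) in TRIPLES
    )


def is_biobank_organization(x):
    """organization AND (owns SOME biobank OR administrates SOME biobank)
    AND bearer_of SOME legal person role."""
    return (
        has_type(x, "organization")
        and (some(x, "owns", "biobank") or some(x, "administrates", "biobank"))
        and some(x, "bearer_of", "legal person role")
    )


def administration_has_owner(a, b):
    """Checks only the precondition of 'administrates': some c owns b."""
    administers = (a, "administrates", b) in TRIPLES
    has_owner = any(p == "owns" and o == b for (_, p, o) in TRIPLES)
    return administers and has_owner


biobank_organizations = sorted(x for x in TYPES if is_biobank_organization(x))
```

In this toy data set the organization without a legal person role is excluded, which is the reason the axiom refers to the OMRSE class rather than relying on the OBI definition of organization alone.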

OMIABIS includes a total of 249 classes and 64 object 
properties. Of the 249 classes, 34 are restricted 
by an equivalent class axiom. 35 classes and object 
properties were newly created for the initial version of 
OMIABIS. A textual definition is given for all newly 
created classes and object properties. Figure 1 shows a 
semantic network for the central classes of OMIABIS 
and how they are used in retrieving data matching the 
competency questions. 

Brochhausen et al. Journal of Biomedical Semantics 2013, 4:23 
http://www.jbiomedsem.com/content/4/1/23 

Table 1 DL queries executed on the Competency Test OWL file and results 

Competency question | DL query | Recall | Precision | Ratio | Reasoning time (in ms) 
Which biobanks hold frozen specimens? | biobank and has_part some 'frozen specimen' | 100% | 100% | 6/6 | 76.9 
Which biobanks hold blood, plasma and serum? | biobank and has_part some 'blood plasma specimen' and has_part some 'blood serum specimen' and has_part some 'blood specimen' | 100% | 100% | 5/5 | 53.2 
Which blood plasma specimens are owned by one specific biobank organization? | 'blood plasma specimen' and part_of some (biobank and 'is owned by' some {'Unseen University'}) | 100% | 100% | 6/6 | 45.4 
Which departments of a specific university have members that are serving as biobank contacts? | department and 'has organization member' some (bearer_of some 'biobank contact role') | 100% | 100% | 6/6 | 30.2 
What are the e-mail addresses of all biobank contact persons at one specific biobank organization? | 'email address' and 'is contact information about' some (bearer_of some 'biobank contact role' and 'is member of organization' some {'Unseen University'}) | 100% | 100% | 6/6 | 55.2 

Each query is encoded in a separate test method in Scala using the Java OWL-API. Each method reloads the ontology, builds the necessary axiom from API calls, 
adds the axiom, executes the query, and removes the axiom from the ontology. Each trial was conducted by running each method 100 times and reporting the 
average running time in milliseconds. Ten trials were conducted. 

The OMIABIS labels tend to be very long, since they 
are referring to the ontological hierarchy. However, we 
foresee that for future use cases we might add more and 
shorter labels for those classes to accommodate devel- 
opers and users. In OMIABIS the MIABIS attributes are 
given as "alternative name" for the class in question. 

Performance of OMIABIS regarding the competency 
questions 

Table 1 shows the DL queries we executed using the DL 
query tab of Protege and their results. The test ontology 
based on OMIABIS and populated with example individ- 
uals performed flawlessly in answering the competency 
questions as specified in the Methods section. 
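
The recall and precision figures reported in Table 1 follow from comparing the individuals a query returns with the individuals it is expected to return. The sketch below illustrates that calculation in Python; the individual names are hypothetical, and reading the "Ratio" column as retrieved true positives over expected individuals is an assumption, since the table does not define it.

```python
def recall_precision(retrieved, expected):
    """Return (recall, precision) for a set of retrieved individuals."""
    retrieved, expected = set(retrieved), set(expected)
    true_positives = retrieved & expected
    # By convention, an empty denominator is scored as 1.0 here.
    recall = len(true_positives) / len(expected) if expected else 1.0
    precision = len(true_positives) / len(retrieved) if retrieved else 1.0
    return recall, precision


# Hypothetical test individuals: six that should match, two decoys that
# are present in the file but must not be returned.
expected = {f"biobank {i}" for i in range(1, 7)}
decoys = {"biobank 7", "biobank 8"}

# Pretend the query returned exactly the expected individuals.
retrieved = set(expected)

recall, precision = recall_precision(retrieved, expected)
ratio = f"{len(retrieved & expected)}/{len(expected)}"
```

Had a decoy been returned, precision would drop below 100% while recall stayed at 100%, which is why the test file contains individuals that should not be retrieved.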

Discussion 

OMIABIS in relation to pre-existing efforts in 
biobank ontology 

Ontologies have been identified as a key technology to 
overcome the lack of semantic integration of biobank- 
related data [21]. [3] demonstrates how pre-existing on- 
tologies, namely the Ontology of Biomedical Investiga- 
tion (OBI) [22] and BioTop [23] can be efficiently used 
to represent data regarding samples and sample curation 
in a semantically rich way. The methodology used, and 
the criteria applied, by [3] overlap with our approach to 
ontology development. In our research we focused on 
administrative data regarding biobanks, sample collec- 
tions, and studies producing sample collections, whereas 
[3] focuses on individual specimens or samples. We plan 
to use a similar approach and integrate their work in subsequent 
research that will address the issue of properties 
of individual samples. The authors of [24] developed an ontology-based 
architecture to integrate data from heterogeneous 
biobanks by unifying metadata. Since the outcome of their 
development is not open source, we contacted the developers 
and aim to cooperate with them on the OMIABIS 
project. 

Another ontology that represents biobanks/biorepositories 
is the eagle-i resource ontology (ERO)^r, which was 
created for the eagle-i project. The aim of the eagle-i project 
is to "create a searchable inventory of unique, rare or 
otherwise hard-to-find biomedical resources ... to foster 
sharing and linking of resources in the larger scientific 
community". ERO is used to integrate data about biomed- 
ical resources and make the search functionality more 
flexible [25]. However, due to its use case ERO is relatively 
sparse with respect to axiomatic representation of its clas- 
ses. Our goal was to provide a semantically rich ontology 
that allows extensive reasoning, so re-use of ERO classes 
was not an option. In addition, we found ambiguities and 
lack of clarity in its representation of biobanks, specifically 
the fact that it defines biobank organization instead of 
biobank. We have since begun collaborating with the 
ERO developers on the branches of ERO related to 
biobanks and their management. 

Performance of OMIABIS regarding the competency 
questions 

That all true positives were retrieved and none of 
the false positives were suggests that the ontology 
performs well on the test data. Based on our timing results when running 
the queries, we suspect that the axiomatic definition of 
"biobank" (given in Results section) is computationally 
"expensive". Relatively simple queries that used this class 
ran slower than complex queries that did not refer to it. 

Figure 1 Illustration of the central OMIABIS classes. The figure shows the central classes of OMIABIS and the object properties connecting 
them. Light blue rectangles are classes; light blue arrows are object properties. Dark blue circles and edges represent instances that can be 
retrieved using OMIABIS. 

We are aware that the number of individuals in the 
competency test ontology is small. Both (1) the initial 
use cases from BBMRI and (2) the usage of OMIABIS 
in i2b2, which we present below, include federated search 
in multiple databases. This raises the question of how the 
ontology will be used to query across large data sets. Our 
scenarios focus on researchers retrieving data about pos- 
sible sources of specimens (BBMRI) or specific specimens 
(i2b2) to do research. This task is part of a study's plan- 
ning phase. It is not related to patient-related activities or 
the performance of lab work. Thus, we believe, it is rea- 
sonable to provide the researcher with the benefit of a fed- 
erated search at the cost of speed. The query results could 
be sent to the researcher once they are available. There 
does not seem to be the need for immediate recall. None- 
theless, we do want to keep reasoning time to a minimum 
once we start running queries on large data sets. We 
therefore plan to implement or develop methods to 
ensure timely recall. 
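
Keeping reasoning time in view requires a repeatable measurement. The note to Table 1 describes the scheme used for the reported times: each query method is run 100 times, the average running time in milliseconds is recorded, and ten such trials are conducted. The sketch below reproduces that scheme in Python rather than in the Scala/OWL-API code used for the paper; `placeholder_query` is a hypothetical stand-in for the load, add-axiom, query, remove-axiom cycle, and combining the ten trial averages with a mean is an assumption, as the text does not say how they were summarized.

```python
import statistics
import time


def average_runtime_ms(query, runs=100):
    """Run query `runs` times and return the mean time per run in ms."""
    start = time.perf_counter()
    for _ in range(runs):
        query()
    elapsed_seconds = time.perf_counter() - start
    return elapsed_seconds * 1000.0 / runs


def run_trials(query, trials=10, runs=100):
    """Repeat the 100-run measurement for the given number of trials."""
    return [average_runtime_ms(query, runs) for _ in range(trials)]


def placeholder_query():
    # Hypothetical stand-in for: reload ontology, build and add the
    # query axiom, execute the query, remove the axiom again.
    return sum(range(1000))


trial_averages = run_trials(placeholder_query)
overall_ms = statistics.mean(trial_averages)  # assumed summary statistic
```

Reporting the spread of the trial averages alongside the mean would show whether differences such as those between the queries in Table 1 exceed run-to-run variation.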

Ontological challenges regarding the MIABIS attribute 
"biobank type" 

Taking into consideration the immediately biobank- 
related attributes in MIABIS, we found one attribute to be 
challenging from the perspective of ontology develop- 
ment: biobank type. Among the values for this attribute in 
MIABIS are for example Pathology, Cytology, Gynecology 
etc. There are strong indications from MIABIS users that 
this list is not exhaustive. The rationale behind this attri- 
bute and its current values is to allow the person submit- 
ting data about a biobank to easily select something that 
seems plausible to her. However, the downside of this approach 
is a certain difficulty for end users to find relevant 
biobanks and studies for their research. The possible values 
for biobank type in MIABIS are under elaboration and 
will be updated as time progresses. A particular specimen 
collection, by virtue of the type of specimens stored, 
might be of interest to both pathologists and virologists, 
or gynecologists and cytologists, and so on. In order to 
provide useful ontological representation of these classes 
we need users to specify which characteristics of a 
biobank make it useful for which specialty of medicine or 
which research domain. 

Using OMIABIS to annotate data in i2b2 

In addition to putting OMIABIS to use within the BBMRI 
framework, we plan to use it for biobank data manage- 
ment at University of Arkansas for Medical Sciences 
(UAMS) and the Arkansas Children's Hospital Research 
Institute (ACHRI). UAMS has a Tissue Procurement 
Facility and several relatively smaller individual research 
labs (e.g. the Myeloma Institute, the "Spit for the Cure" 
Project). In addition, ACHRI has several independent labs 
similarly managing specimens, including the Center 
for Birth Defects Research, Section of Developmental- 
Behavioral and Rehabilitative Pediatrics (autism research), 
and the Women's Mental Health Program. Both UAMS 
and ACHRI would like to share their collected specimens 
and annotated data for research purposes while keeping 
the operations of each lab independent. Recently, UAMS 
created an Enterprise Data Warehouse (EDW) to facilitate 
access to and integration of clinical, basic-science, and 
other data for research and quality reporting. Retrieving 
de-identified data from the EDW is done using Informa- 
tics for Integrating Biology and the Bedside (i2b2) [26,27], 
an open-source software application. i2b2 was designed 
primarily for cohort identification, allowing users to 
perform queries to determine the existence of a set of 
patients meeting certain inclusion or exclusion criteria. 
Researchers have requested adding the ability to search 
for specimens to the data warehouse. 

To ensure semantic integration of data from multiple 
biobanks with research relevant patient data, i2b2 requires 
an ontology to which the data will be mapped in i2b2's 
Ontology Cell. Because the management, the operations, 
and the data collected in the biobanks are heterogeneous, 
manual mapping of the data into a single i2b2 instance is 
a challenge. Instead, a federated architecture where quer- 
ies are distributed to individual nodes and the results 
merged is the more promising approach. This approach 
requires a common ontology like OMIABIS. 
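
The federated pattern described above, in which a query is distributed to individual nodes and the results are merged, can be sketched as follows. This is an illustration in Python only, not the i2b2 or caTissue interface: the node names, the records, and the `query_node` function are hypothetical, and each node is simulated by an in-memory list. The sketch presumes that all nodes annotate specimens with the same class labels, which is the role a common ontology plays.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-node specimen records, annotated with shared class labels.
NODE_DATA = {
    "node A": [
        {"specimen": "S1", "type": "blood plasma specimen"},
        {"specimen": "S2", "type": "frozen specimen"},
    ],
    "node B": [
        {"specimen": "S3", "type": "blood plasma specimen"},
    ],
    "node C": [],
}


def query_node(node, specimen_type):
    """Stand-in for sending one query to one node and reading its answer."""
    return [
        dict(record, node=node)
        for record in NODE_DATA[node]
        if record["type"] == specimen_type
    ]


def federated_query(specimen_type, nodes=NODE_DATA):
    """Send the same query to every node in parallel and merge the results."""
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(query_node, node, specimen_type)
            for node in sorted(nodes)
        ]
        # Collect in submission order so the merged list is deterministic.
        merged = [row for future in futures for row in future.result()]
    return merged


results = federated_query("blood plasma specimen")
```

Because every node answers in terms of the same labels, the merge step is a plain concatenation; without shared semantics each node's answer would first need its own mapping.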

Currently the biobanks at UAMS use caTissue [28], an 
open-source biospecimen management tool. caTissue is 
developed under the cancer Biomedical Informatics Grid 
(caBIG) initiative of the National Cancer Institute (NCI). 
It facilitates the process of locating and analyzing tissue 
specimens by cancer researchers based on clinical, tissue, 
and genomic characteristics. caTissue Annotation forms 
store clinical and other related data about specimens. Also 
called Dynamic Extensions, this component allows the 
creation of new forms that contain fields a site wishes to 
collect about each specimen. 

Despite using a single software application, integration 
of data is not guaranteed in this approach because each 
biobank creates its own specimen annotation forms with 
different data elements. To ensure and optimize semantic 
integration, we will incorporate an ontology into 
caTissue's annotation forms for all UAMS/ACHRI biobanks 
and the biobank administration data model. Then, 
the data in separate caTissue instances for the biobanks 
can be easily incorporated into the EDW i2b2 instance, 
and queried with common semantics. The researchers running 
the EDW have identified OMIABIS as the ontology they 
will use for biobank data. 

OMIABIS class ► caTissue data element 

biobank identifier with ISO 3166-1 alpha-2 code ► Code formed by appending unique biobank id to ISO 3166-1 alpha-2 code for country 
biobank proper name ► Site Name 
biobank organization ► Site Coordinator Institution 
biobank unique resource locator ► Site URL 
biobank location ISO 3166-1 alpha-2 code ► Site Country 
biobank contact person ► Site Coordinator 
biobank contact's telephone number ► Site Phone 
email address of biobank contact person ► Site Coordinator Email 
biobank contact's department ► Site Coordinator Department 
intra-settlement specification part of biobank contact's address ► Site Address/Street Name 
biobank contact's postal code ► Site Zipcode 
settlement part of biobank contact's address ► Site City 
biobank contact person's ISO 3166-1 alpha-2 code ► Site Country 
hosted study collection ► Collection of all collection protocol titles 
sample collection or study identifier ► Collection Protocol IRB Identifier 
study proper name ► Collection Protocol Title 
study principle investigator or responsible ► Collection Protocol Principal Investigator 
sample collection or study contact person ► Collection Protocol Coordinator 
sample collection or study contact's telephone number ► Collection Protocol Coordinator Phone 
sample collection or study contact's email address ► Collection Protocol Coordinator Email 
sample collection or study contact's department ► Collection Protocol Coordinator Department 
intra-settlement specification part of sample collection or study contact's address ► Collection Protocol Coordinator Address 
sample collection or study contact's postal code ► Collection Protocol Coordinator Zipcode 
settlement part of sample collection or study contact's address ► Collection Protocol Coordinator City 
sample collection or study contact's ISO 3166-1 alpha-2 code ► Collection Protocol Coordinator Country 

Figure 2 Mapping between OMIABIS classes and caTissue data elements. 

Figure 2 shows the mapping of 
OMIABIS terms to caTissue data elements previously used 
by UAMS' EDW. 
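
The mapping in Figure 2 amounts to a lookup table from OMIABIS class labels to caTissue data elements. The Python sketch below shows how such a table could be used to re-key a caTissue-style record by OMIABIS labels, and how the biobank identifier could be formed from a country code and a local id. Only a subset of the Figure 2 rows is included; the example record, the function names, and the choice of plain concatenation without a separator are assumptions made for illustration.

```python
# Subset of the Figure 2 mapping: OMIABIS class label -> caTissue element.
OMIABIS_TO_CATISSUE = {
    "biobank proper name": "Site Name",
    "biobank organization": "Site Coordinator Institution",
    "biobank unique resource locator": "Site URL",
    "biobank contact person": "Site Coordinator",
    "email address of biobank contact person": "Site Coordinator Email",
}


def biobank_identifier(country_code, biobank_id, separator=""):
    """Append the local biobank id to an ISO 3166-1 alpha-2 country code.

    The separator is an assumption; Figure 2 only says "appending".
    """
    code = country_code.strip().upper()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"not an alpha-2 country code: {country_code!r}")
    return f"{code}{separator}{biobank_id}"


def annotate(record):
    """Re-key a caTissue-style record by OMIABIS class labels.

    Fields without a mapping are dropped.
    """
    inverse = {element: label for label, element in OMIABIS_TO_CATISSUE.items()}
    return {
        inverse[field]: value
        for field, value in record.items()
        if field in inverse
    }


# Hypothetical caTissue site record.
site = {
    "Site Name": "Example Tissue Bank",
    "Site URL": "http://biobank.example.org",
    "Freezer": "F-12",  # no OMIABIS counterpart in the mapping above
}

annotated = annotate(site)
identifier = biobank_identifier("us", "0042")
```

Fields that fall outside the mapping, such as the freezer entry here, are the places where coverage of the ontology would need to be checked against local data structures.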

McCusker et al. [29] studied an option that converts 
NCIt-curated Unified Modelling Language (UML) 
annotations to OWL using semCDI. semCDI query formulation 
uses a view of caBIG semantic concepts, metadata, 
and data as an ontology [30]. As a result, 
OWL annotation properties are used to represent metadata 
on OWL constructs and are not considered for reasoning 
purposes. McCusker et al. thus created 
their own UML-to-OWL transformation, which does not 
model attributes as datatype properties and does not 
model NCIt annotations of UML classes using subsumption. 
This methodology limits both expressivity and 
reasoning ability. In addition, this approach did not 
consider multiple biobanks. 

To fulfill all requirements of biobank data integration 
within the UAMS/ACHRI framework, in the future 
OMIABIS representations will need to be integrated with 
ontologies representing individual specimens and donors. 

Our next step is to cooperate with other biobank pro- 
jects and biobank ontologies to extend OMIABIS and to 
work towards a domain ontology for biobanking as a 
whole. OMIABIS will be curated and maintained as an 
open-source artifact using Subversion on an ongoing 
basis, with periodic releases of new versions. 

Conclusions 

In conclusion, we created OMIABIS, an ontology of 
biobank administration. We found that basing its devel- 
opment on pre-existing resources to meet the BBMRI 
use cases resulted in a biobanking ontology that is 
re-useable in environments other than BBMRI. With respect 
to answering the competency questions, our queries 
against an OMIABIS-based ontology, populated with 
a small set of hypothetical test cases, retrieved only true 
positives and did not miss any true positives. In addition, 
the mapping to a pre-existing data structure in the 
open-source caTissue application used for biospecimen 
collections in a medical center in Little Rock, AR dem- 
onstrated the adequacy of the coverage of OMIABIS. 

Endnotes 

^a http://purl.obolibrary.org/obo/obi.owl 

^b http://translationalmedicineontology.googlecode.com/svn/trunk/ontology/tmo.owl 

^c http://apps.who.int/classifications/icd10/browse/2010/en 

^d The Public Population Project in Genomics (P3G): http://www.p3g.org. 

^e The International Society for Biological and Environmental Repositories (ISBER): http://www.isber.org. 

^f Basic Formal Ontology (BFO): http://ifomis.org/1.1 

^g Principles of the OBO Foundry: http://obofoundry.org/crit.shtml 

^h The Proper Name Ontology (PNO): http://purl.obolibrary.org/obo/iao/pno.owl 

^i The Information Artifact Ontology (IAO): http://purl.obolibrary.org/obo/iao.owl 

^j The Ontology of Biomedical Investigation (OBI): http://purl.obofoundry.org/obo/obi.owl 

^k The Ontology of Medically Related Social Entities (OMRSE): http://purl.obolibrary.org/obo/omrse.owl 

^l The Protege Ontology Editor and Knowledge Acquisition System: http://protege.stanford.edu 

^m HermiT Reasoner: http://www.hermit-reasoner.com 

^n Classes printed bold, object properties in italics and OPERATORS all caps. Definitions of classes referred to here can be found in Table 1. 

^o Note that this class description is based on object properties and classes from BFO, IAO and OBI. 

^p http://purl.obolibrary.org/obo/OBI_0000245 

^q The 'transfers' object property is represented in Document Acts Ontology (d-acts): http://purl.obolibrary.org/obo/iao/d-acts.owl 

^r The eagle-i Resource Ontology (ERO): http://purl.obolibrary.org/obo/ero.owl 

Additional file 



Additional file 1: OMIABIS Competency Test. 